JSI-GAN: GAN-Based Joint Super-Resolution and Inverse Tone-Mapping with Pixel-Wise Task-Specific Filters for UHD HDR Video
Joint learning of super-resolution (SR) and inverse tone-mapping (ITM) has
been explored recently, to convert legacy low resolution (LR) standard dynamic
range (SDR) videos to high resolution (HR) high dynamic range (HDR) videos to
meet the growing needs of UHD HDR TV/broadcasting applications. However, previous
CNN-based methods directly reconstruct the HR HDR frames from LR SDR frames,
and are only trained with a simple L2 loss. In this paper, we take a
divide-and-conquer approach in designing a novel GAN-based joint SR-ITM
network, called JSI-GAN, which is composed of three task-specific subnets: an
image reconstruction subnet, a detail restoration (DR) subnet and a local
contrast enhancement (LCE) subnet. We delicately design these subnets so that
they are appropriately trained for the intended purpose, learning a pair of
pixel-wise 1D separable filters via the DR subnet for detail restoration and a
pixel-wise 2D local filter by the LCE subnet for contrast enhancement.
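(As an illustration only: the PyTorch sketch below shows one common way such
pixel-wise dynamic filtering can be applied, by gathering the k x k
neighbourhood of every pixel and weighting it with the predicted per-pixel
filters. The function names, tensor shapes, kernel size k, and the
unfold-based implementation are assumptions, and the joint upscaling step of
the actual network is omitted.)

import torch
import torch.nn.functional as F

def apply_separable_filters(x, f_v, f_h, k=5):
    # x: (B, C, H, W) input; f_v, f_h: (B, k, H, W) predicted per-pixel
    # vertical/horizontal 1D filters (one length-k pair for every pixel).
    B, C, H, W = x.shape
    # Gather the k x k neighbourhood of every pixel: (B, C, k, k, H, W).
    patches = F.unfold(x, kernel_size=k, padding=k // 2).view(B, C, k, k, H, W)
    # Apply the filters separably: vertical axis first, then horizontal axis.
    out = (patches * f_v.view(B, 1, k, 1, H, W)).sum(dim=2)  # (B, C, k, H, W)
    out = (out * f_h.view(B, 1, k, H, W)).sum(dim=2)         # (B, C, H, W)
    return out

def apply_local_2d_filter(x, f2d, k=5):
    # x: (B, C, H, W); f2d: (B, k*k, H, W) predicted per-pixel 2D filters.
    B, C, H, W = x.shape
    patches = F.unfold(x, kernel_size=k, padding=k // 2).view(B, C, k * k, H, W)
    return (patches * f2d.unsqueeze(1)).sum(dim=2)           # (B, C, H, W)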
Moreover, to train the JSI-GAN effectively, we propose a novel detail GAN loss
alongside the conventional GAN loss, which helps enhance both local details
and contrast to reconstruct high-quality HR HDR results. When all subnets are
jointly trained, the predicted HR HDR results are of higher quality, with at
least a 0.41 dB gain in PSNR over those generated by previous methods.
Comment: The first two authors contributed equally to this work. Accepted at
AAAI 2020. (Camera-ready version)
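(For illustration, the sketch below pairs a conventional adversarial loss on
the full frame with a detail adversarial loss computed on a detail layer of
the frame. The box-blur base/detail decomposition, the two-discriminator
interface, and the loss weighting are assumptions for exposition, not the
paper's exact formulation.)

import torch
import torch.nn.functional as F

def detail_layer(x, k=15, eps=1e-3):
    # Crude base/detail decomposition: divide the frame by a box-blurred base.
    base = F.avg_pool2d(x, kernel_size=k, stride=1, padding=k // 2)
    return x / (base + eps)

def generator_adversarial_loss(d_image, d_detail, fake_hr_hdr, lambda_detail=1.0):
    # Conventional GAN loss on the predicted frame plus a detail GAN loss on
    # its detail layer; d_image and d_detail are discriminators returning logits.
    logits_img = d_image(fake_hr_hdr)
    logits_det = d_detail(detail_layer(fake_hr_hdr))
    loss_img = F.binary_cross_entropy_with_logits(logits_img, torch.ones_like(logits_img))
    loss_det = F.binary_cross_entropy_with_logits(logits_det, torch.ones_like(logits_det))
    return loss_img + lambda_detail * loss_det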
FISR: Deep Joint Frame Interpolation and Super-Resolution with a Multi-Scale Temporal Loss
Super-resolution (SR) has been widely used to convert low-resolution legacy videos to high-resolution (HR) ones, to suit the increasing resolution of displays (e.g. UHD TVs). However, it becomes easier for humans to notice motion artifacts (e.g. motion judder) in HR videos rendered on larger-sized display devices. Thus, broadcasting standards support higher frame rates for UHD (Ultra High Definition) videos (4K@60 fps, 8K@120 fps), meaning that applying SR alone is insufficient to produce genuinely high-quality videos. Hence, to up-convert legacy videos for realistic applications, not only SR but also video frame interpolation (VFI) is required. In this paper, we first propose a joint VFI-SR framework for up-scaling the spatio-temporal resolution of videos from 2K 30 fps to 4K 60 fps. For this, we propose a novel training scheme with a multi-scale temporal loss that imposes temporal regularization on the input video sequence, which can be applied to any general video-related task. The proposed structure is analyzed in depth with extensive experiments.
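(As a rough illustration of temporal regularization at multiple temporal
scales, the sketch below compares temporal differences of the predicted and
ground-truth frame sequences at several strides. The specific difference
terms, strides, and weighting are assumptions, not the paper's exact
multi-scale temporal loss.)

import torch

def multiscale_temporal_loss(pred, gt, strides=(1, 2)):
    # pred, gt: (B, T, C, H, W) predicted / ground-truth frame sequences.
    # Penalise mismatched frame-to-frame changes at several temporal strides.
    loss = pred.new_zeros(())
    for s in strides:
        d_pred = pred[:, s:] - pred[:, :-s]  # temporal differences at stride s
        d_gt = gt[:, s:] - gt[:, :-s]
        loss = loss + torch.mean(torch.abs(d_pred - d_gt))
    return loss / len(strides)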